Detecting Polysemy in Hard and Soft Cluster Analyses of German Preposition Vector Spaces

Authors

  • Sylvia Springorum
  • Sabine Schulte im Walde
  • Jason Utt
Abstract

This paper presents a methodology to identify polysemous German prepositions by exploring their vector spatial properties. We apply two cluster evaluation metrics (the Silhouette Value (Kaufman and Rousseeuw, 1990) and a fuzzy version of the V-Measure (Rosenberg and Hirschberg, 2007)) as well as various correlations, to exploit hard vs. soft cluster analyses based on Self-Organising Maps. Our main hypothesis is that polysemous prepositions are outliers, and thus represent either (i) singletons or (ii) marginals of the clusters within a cluster analysis. Our analyses demonstrate that (a) in a subset of the clusterings, singletons have a tendency to contain polysemous prepositions; and (b) misclassification and cluster membership rate exhibit a moderate correlation with ambiguity rate.
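
As a rough illustration of the outlier hypothesis in the abstract, the sketch below computes the Silhouette Value s(i) = (b - a) / max(a, b) for each item of a given hard clustering, then lists singletons and low-silhouette "marginal" members. It is not the authors' implementation: the Euclidean distance, the 0.25 threshold, the placeholder names `p1` to `p7`, the toy vectors, and the helper `flag_outliers` are all assumptions made for the example. The Self-Organising Maps and the fuzzy V-Measure are not covered.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_values(vectors, labels):
    """Return s(i) = (b - a) / max(a, b) for every item.

    a: mean distance to the other members of the item's own cluster.
    b: smallest mean distance to the members of any other cluster.
    Singleton clusters receive 0.0 by convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette values need at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # singleton cluster
            scores.append(0.0)
            continue
        a = sum(euclidean(vectors[i], vectors[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(vectors[i], vectors[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def flag_outliers(names, vectors, labels, threshold=0.25):
    """Hypothetical helper: list singletons and low-silhouette cluster members."""
    sizes = {}
    for lab in labels:
        sizes[lab] = sizes.get(lab, 0) + 1
    scores = silhouette_values(vectors, labels)
    singletons = [n for n, lab in zip(names, labels) if sizes[lab] == 1]
    marginals = [
        n
        for n, lab, s in zip(names, labels, scores)
        if sizes[lab] > 1 and s < threshold
    ]
    return {"singletons": singletons, "marginals": marginals}


# Toy data (invented for illustration, not taken from the paper).
names = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
vectors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (5, 5), (20, 0)]
labels = ["A", "A", "A", "B", "B", "A", "C"]

print(flag_outliers(names, vectors, labels))
```

In this toy setup, `p7` sits alone in its cluster and `p6` lies between clusters A and B, so these are the two items the hypothesis would single out as candidates for polysemy. Whether such candidates really are polysemous is the empirical question the paper addresses.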


Similar resources

Automatic Semantic Classification of German Preposition Types: Comparing Hard and Soft Clustering Approaches across Features

This paper addresses an automatic classification of preposition types in German, comparing hard and soft clustering approaches and various window- and syntax-based co-occurrence features. We show that (i) the semantically most salient preposition features (i.e., subcategorised nouns) are the most successful, and that (ii) soft clustering approaches are required for the task but reveal quite diffe...

A Rank-based Distance Measure to Detect Polysemy and to Determine Salient Vector-Space Features for German Prepositions

This paper addresses vector space models of prepositions, a notoriously ambiguous word class. We propose a rank-based distance measure to explore the vector-spatial properties of the ambiguous objects, focusing on two research tasks: (i) to distinguish polysemous from monosemous prepositions in vector space; and (ii) to determine salient vector-space features for a classification of preposition...

An Annotation Schema for Preposition Senses in German

Prepositions are highly polysemous. Yet, little effort has been spent to develop language-specific annotation schemata for preposition senses to systematically represent and analyze the polysemy of prepositions in large corpora. In this paper, we present an annotation schema for preposition senses in German. The annotation schema includes a hierarchical taxonomy and also allows multiple annotati...

Fauna and frequency of hard ticks of livestock in South Khorasan province in 2018: Short Communication

Identification of hard tick species and their hosts is essential for the development of control and prevention programs for tick-borne diseases. In this descriptive cross-sectional study, ticks were collected from sheep, goats, and camels in different regions of South Khorasan province, Iran, in 2018 using a cluster sampling method. Fauna and frequency of ticks were recorded and analyzed in S...

Verb polysemy and frequency effects in thematic fit modeling

While several data sets for evaluating thematic fit of verb-role-filler triples exist, they do not control for verb polysemy. Thus, it is unclear how verb polysemy affects human ratings of thematic fit and how best to model that. We present a new dataset of human ratings on high vs. low-polysemy verbs matched for verb frequency, together with high vs. low-frequency and well-fitting vs. poorly-f...

Publication date: 2013